METHOD AND SYSTEM FOR DETECTING ARTIFACTS IN ICU PATIENT
RECORDS BY DATA FUSION AND HYPOTHESIS TESTING
The present invention relates generally to expert systems, and more particularly to 
an expert system for use in evaluating data from a patient. 

Healthcare technology (e.g., biomedical sensors, monitoring systems and medical
devices) is rapidly advancing in capability as well as in sheer prevalence (number of
devices) in the modern intensive care unit (ICU). The creation of additional data streams
imposes a significant "information-overload" challenge on healthcare staff, who also
face a critical shortage of intensive care personnel to meet the needs of the ICU patient
population.

The present invention is therefore directed to the problem of developing a method
and apparatus for reducing the amount of information that must be processed manually in
an intensive care environment or other healthcare environment.

The present invention solves this and other problems by providing a method and
apparatus composed of intelligent modules, which are capable of assimilating multiple data
streams originating from a broad array of sensors and systems and are able to distinguish
clinically-significant changes in patient states from clinically-insignificant changes or
artifacts.

According to one aspect of the present invention, an exemplary embodiment of a
method for monitoring a patient includes employing hypothesis testing against each of
several monitored signals to determine whether an artifact is present in the monitored
signals. In the hypothesis testing, a null hypothesis includes an assumption that pairs of
samples of highly correlated monitored signals of the several monitored signals have a
predetermined distribution. The exemplary embodiment of the method then determines
that an artifact may exist in one of the plurality of monitored signals when a likelihood that
the null hypothesis is true falls below a predetermined confidence level. In general, the
hypothesis test indicates whether the data being obtained matches historical data from
patients with similar conditions. The comparison matches pairs of samples of highly
correlated monitored signals against historical versions of the same monitored signals.

According to another aspect of the present invention, an exemplary embodiment of
a method for detecting an artifact in one or more samples (s_1 ... s_n) of monitored signals
(S_1 ... S_n) includes: calculating, for each sample (s_m) of the one or more samples
(s_1 ... s_n) of monitored signals (S_1 ... S_n), a cross probability (p_mk) of observing
each sample (s_m) and another sample (s_k) assuming a null hypothesis is true, wherein the
null hypothesis (H_0) is that each sample (s_m) and each other sample (s_k) have the same
distribution as stored versions; calculating a confidence level (c_mk) associated with each
of the cross probabilities (p_mk); repeating the calculating steps for all combinations of
pairs of highly correlated monitored signals; summing, for each sample (s_m), all of the
cross probabilities (p_mk) associated with a pair of highly correlated signals (S_mk) that
includes the sample (s_m); and outputting a result for each sample (s_m) as a probability of
not including an artifact in each sample, wherein, if one or more of these probabilities of
not including an artifact lies below a predetermined threshold, a user is informed that one
or more samples associated with those probabilities may include an artifact.

According to another aspect of the present invention, an exemplary embodiment of
an apparatus for monitoring a patient includes multiple leads, a memory and a processor.
Each of the leads receives a sample of a monitored signal. The memory stores each of the
received samples of the monitored signals. The processor is coupled to the memory and is
programmed to: employ hypothesis testing against each monitored signal to determine
whether an artifact is present in the monitored signals, in which a null hypothesis includes
an assumption that pairs of samples of highly correlated monitored signals of the
monitored signals have a predetermined distribution; and determine that an artifact may
exist in one of the monitored signals when a likelihood that the null hypothesis is true falls
below a predetermined confidence level. The apparatus may include a user interface to
output this information to a user in a meaningful manner.

According to yet another aspect of the present invention, the methods herein may
be encoded in computer readable media as instructions for a processor.

Other aspects of the invention will be apparent to those of skill in the art upon 
review of the detailed description in light of the following drawings. 

FIG 1 depicts a block diagram of an exemplary embodiment of a method for
processing multiple data streams according to one aspect of the present invention.

FIG 2 depicts a flow chart of an exemplary embodiment of a method for monitoring
a patient according to yet another aspect of the present invention.

FIG 3 depicts a flow chart of another exemplary embodiment of a method for
monitoring a patient according to still another aspect of the present invention.

FIG 4 depicts a block diagram of an apparatus for monitoring a patient according to
still another aspect of the present invention.

Any reference herein to "one embodiment" or "an embodiment" means that a particular
feature, structure, or characteristic described in connection with the embodiment is
included in at least one embodiment of the invention. The appearances of the phrase "in
one embodiment" in various places in the specification are not necessarily all referring to
the same embodiment.

The present invention provides, inter alia, a method for using a system composed of
intelligent modules, which are capable of assimilating multiple data streams originating
from a broad array of sensors and systems and are able to distinguish clinically-significant
changes in patient states from clinically-insignificant ones or artifacts. The present
invention includes a method for data stream fusion, which will enable multi-parameter
monitoring capabilities.

According to one aspect of the present invention, the present invention also
includes a method for detecting artifacts in a given monitored signal (or a set of monitored
signals) based on statistical analysis. Hypothesis testing can be used to determine whether
the monitored signal is the result of an artifact or a significant clinical change. A
hypothesis test is a procedure for determining whether an assertion about a certain
population characteristic is statistically reasonable.

According to one aspect of the present invention, hypothesis testing can be used to
determine whether one or more artifacts are present in recently obtained data. For
example, recently obtained data is hypothesized to have a distribution that follows the
distribution of similar data obtained over a long period of time (or across many
patients). If the hypothesis turns out to be true (within some confidence interval), then the
recently obtained data is more likely not to contain artifacts, whereas if the hypothesis is
not true, then the converse is more likely. By employing highly correlated signals in pairs
as the basis for the hypothesis test, greater confidence can be obtained in the result.

Assume there are a number of monitored signals (s_1, s_2, s_3, ..., s_n), for which one
needs to have an indicator for the presence of artifacts in every signal. The process starts
by running off-line correlation testing among recorded banks of these signals. ECG/EEG
signal database sources are publicly available and can be easily obtained. The resulting
correlation matrix gives an indicator of the cross dependency between every pair of these
signals and would be of the form:

$$
\begin{bmatrix}
r_{11} & \cdots & r_{1n} \\
\vdots & \ddots & \vdots \\
r_{n1} & \cdots & r_{nn}
\end{bmatrix}
\qquad (1)
$$

where r_11 is the autocorrelation of signal s_1 with itself (r_11 = 1) and r_1n is the cross
correlation between signals s_1 and s_n. These cross correlation values are needed for the
statistical analysis as shown in FIG 1. The National Institutes of Health have developed
such a database from which these cross-correlation values can be obtained.
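
As a minimal sketch of the off-line correlation step, the following computes a matrix of the form of equation (1) from a few recorded signals. The use of the Pearson coefficient, the function names `pearson` and `correlation_matrix`, and the sample values are assumptions made for illustration; the text does not specify how the correlation is computed.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def correlation_matrix(signals):
    """Return the n-by-n matrix whose (i, j) entry is r_ij."""
    return [[pearson(a, b) for b in signals] for a in signals]

# Hypothetical recorded signals (illustrative values, not patient data).
s1 = [1.0, 2.0, 3.0, 4.0, 5.0]
s2 = [2.0, 4.0, 6.0, 8.0, 10.0]  # rises with s1
s3 = [5.0, 3.0, 4.0, 1.0, 2.0]   # tends to fall as s1 rises

R = correlation_matrix([s1, s2, s3])
```

The diagonal entries of `R` equal one, matching r_11 = 1 above, and the matrix is symmetric because the correlation of s_i with s_j equals that of s_j with s_i.
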
Hypothesis testing (null hypothesis testing with a predefined significance level,
i.e., confidence interval) is used for determining the probability of the presence of artifacts.
A hypothesis test is a procedure for determining if an assertion about a characteristic of a
population is reasonable. The null hypothesis is the original assertion. In this case, the null
hypothesis (H_0) is that the signals under study have the same distribution as the database of
similar signals.

The alternative hypothesis (H_1) is that the signals under study do not belong to the same
population as the database of similar signals.

The significance level is related to the degree of certainty required to reject the null
hypothesis in favor of the alternative. By taking a small sample one cannot be certain
about one's conclusion. In fact, the smaller the sample, the less certain one can be about
the relationship of the data to a distribution of other data. For example, a single data point
that lies several standard deviations from the mean of a sample of similar data could
either be a statistical anomaly (i.e., the data is valid but represents a point that has a
very small likelihood of occurring, but nevertheless can still occur) or could be an artifact.
However, as the number of data points increases, the confidence with which one can state
that the data is anomalous or based on artifacts increases, because it becomes increasingly
unlikely that several valid data points lie significantly outside the main distribution.

So one must decide in advance to reject the null hypothesis if the probability of
observing a sampled result is less than the significance level. Many researchers employ a
significance level of about five percent. For a typical significance level of five percent
(5%), the notation is α = 0.05. For this significance level, the probability of incorrectly
rejecting the null hypothesis when it is actually true is 5%. For more protection from this
error, a lower value of α should be selected.

The p-value is the probability of observing the given sample result under the
assumption that the null hypothesis is true. If the p-value is less than α, then the null
hypothesis should be rejected. For example, if α = 0.05 and the p-value is 0.03, then the
null hypothesis is rejected. The converse is not true. If the p-value is greater than α, then
one has insufficient evidence to reject the null hypothesis.
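
The decision rule above can be written as a one-line check. This is only an illustrative sketch; the function name `reject_null` is hypothetical, and the default significance level is the 5% value used in the text.

```python
def reject_null(p_value, alpha=0.05):
    """Return True when the null hypothesis should be rejected at level alpha."""
    return p_value < alpha

# The example from the text: alpha = 0.05 and a p-value of 0.03.
decision = reject_null(0.03, alpha=0.05)
```

A return value of False means only that the evidence is insufficient to reject the null hypothesis, not that the null hypothesis has been shown to be true.
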

The outputs for many hypothesis test functions also include confidence intervals.
As the term is used herein, a confidence interval is a range of values that have a chosen
probability of containing the true hypothesized quantity. Suppose, in the foregoing
example, the sampled value is inside a 95% confidence interval for the mean, μ. This is
equivalent to being unable to reject the null hypothesis at a significance level of 0.05.
Conversely, if the 100(1 - α)% confidence interval does not contain this value, then
one rejects the null hypothesis at the α level of significance.

Based on prior testing of the monitored signals under study (s_1, s_2, s_3, ..., s_n) as
stored in the databanks, and on identifying correlated signals as in equation (1), we set a
certain threshold to determine the accepted level of correlation (e.g., reject any correlation
factor r_ij less than 40%). We repeat the same experiment (which results in a different
correlation matrix) for every clinical condition under examination (e.g., angina, bleeding,
brain injury, pulmonary edema, cord compression, metabolic coma, respiratory failure,
etc.). Each clinical condition will have its own correlation matrix, which describes the
success of having any two signals pass hypothesis testing when compared against each
other. For example, as shown in FIG 1, signals s_1 and s_2 have a certain correlation factor
r_12 and a certain range of p-values (e.g., p_min Angina, p_max Angina) in the case of angina,
which is different from the range in the case of respiratory failure. The closer the currently
produced values are to the normal range, the higher the corresponding weight. This is
implemented by assigning more weight to the p-values closer to the nominal ones. For example,

$$
P_{i,j} = \left[ \frac{p_{i,j} - \left( p_{i,j\,\max\,\mathrm{Angina}} - p_{i,j\,\min\,\mathrm{Angina}} \right)}{p_{i,j\,\max\,\mathrm{Angina}} - p_{i,j\,\min\,\mathrm{Angina}}} \right] c_{i,j}
\qquad (2)
$$

where c_ij is a confidence factor (1 - cumulative distribution).

WO 2005/076187 PCT/IB2005/050417

Summing these individual values, the probability that the signal under study has no
artifacts in it is:

$$
P_{\text{no artifacts in signal } i} = \sum_{j} P_{i,j}
$$

where the sum over j is over all the signals that highly correlate with signal i.
The probability of having artifacts in signal i is then (1 - P_no artifacts in signal i). FIG 1
shows a block diagram of the above described process.
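
The summation can be sketched as follows, taking the weighted values P_ij as given inputs so that no particular weighting formula is implied. The function name, the dictionary layout and the numeric values are hypothetical, and the sketch assumes the weights have been scaled so that their sum stays between 0 and 1.

```python
def probability_no_artifact(i, weighted, correlated):
    """Sum the weighted probabilities P_ij over every signal j
    that is highly correlated with signal i."""
    return sum(weighted[(i, j)] for j in correlated[i])

# Hypothetical inputs: signal 1 is highly correlated with signals 2, 5 and 8.
correlated = {1: [2, 5, 8]}
weighted = {(1, 2): 0.30, (1, 5): 0.25, (1, 8): 0.35}  # assumed P_ij values

p_no_artifact = probability_no_artifact(1, weighted, correlated)
p_artifact = 1.0 - p_no_artifact  # probability that signal 1 contains an artifact
```

With these assumed inputs the probability of no artifact is about 0.90, so the probability of an artifact in signal 1 is about 0.10.
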

Turning to FIG 2, shown therein is an exemplary embodiment of a method for
determining whether an artifact is present in monitored signals obtained from a patient
being treated or observed for a specific clinical condition. For example, the monitored
signals could be electrocardiograms, respiration rates, heart rates, or other signals that
provide information as to a patient's health. This method enables a medical clinician or
operator to focus on those aspects of the patient that include clinically significant changes
as opposed to artifacts, which need to be addressed differently. In the case of artifacts, the
signal leads need to be checked to verify data integrity, which can be accomplished by a
technician, for example. In the case of a clinically significant change, a physician or
specialist may be required to review the data to determine the appropriate response.

In step 21, a cross-correlation matrix of the following form is calculated for the
monitored signals:

$$
\begin{bmatrix}
r_{11} & \cdots & r_{1n} \\
\vdots & \ddots & \vdots \\
r_{n1} & \cdots & r_{nn}
\end{bmatrix}
$$

The cross-correlation matrix can be obtained from a database of the monitored signals
obtained from patients under similar clinical conditions. This matrix quantitatively
describes the relationship or correlation between the monitored signals. Thus, r_ij
quantifies the cross-correlation between two monitored signals (S_i and S_j). For example,
r_11 represents the correlation between a signal (S_1) and itself, which is one. This matrix
may vary from clinical condition to clinical condition; therefore each cross-correlation
matrix should be obtained from stored monitored signals that were observed from a
multitude of patients having the same clinical condition.

In step 22, those pairs of signals that are highly correlated are identified. For
example, every value in the matrix above 0.40 (40%) indicates two signals that are highly
correlated. That is, those signals that have cross-correlation values above a
predetermined threshold, such as 40%, are identified as highly correlated. These pairs of
signals can be used to authenticate the data, as signals that are highly correlated should
have samples that are similarly correlated. If not, then the likelihood increases that the
samples are tainted.
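
A minimal sketch of step 22, assuming the cross-correlation matrix is held as a nested list. The function name and the matrix entries are illustrative; comparing the raw value (rather than its absolute value) against the threshold is an assumption based on the wording "above 0.40".

```python
def highly_correlated_pairs(R, threshold=0.40):
    """Return index pairs (i, j) with i < j whose cross-correlation exceeds threshold."""
    n = len(R)
    return [(i, j)
            for i in range(n)
            for j in range(i + 1, n)
            if R[i][j] > threshold]

# Hypothetical cross-correlation matrix for three monitored signals.
R = [
    [1.00, 0.85, 0.10],
    [0.85, 1.00, 0.55],
    [0.10, 0.55, 1.00],
]

pairs = highly_correlated_pairs(R)  # the pairs (0, 1) and (1, 2) pass the 40% threshold
```

Only the upper triangle is scanned, since the matrix is symmetric and the diagonal holds each signal's correlation with itself.
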

In step 23, the maximum and minimum probability values are determined from the
database for each cross-correlated pair. For example, the minimum probability of observing
both signals is identified from the database, as is the maximum probability.

In step 24, for each monitored signal (S_i) for which a sample (s_i) exists (the sample
of interest), a probability (p_ij) of observing the sample (s_i) of the monitored signal (S_i)
along with a sample (s_j) from one of the highly correlated monitored signals (S_j) is
determined. These probabilities (p_ij ... p_ik) are determined for every highly correlated
signal (for which there is a sample) for the given sample of interest. For example, if signals
S_2, S_5 and S_8 (as shown in FIG 1) were highly correlated with respect to signal S_1, then
the cross-probabilities p_12, p_15 and p_18 are determined. The associated confidence value
(c_ij) is also determined for each determined probability, which in this case are c_12, c_15
and c_18. The probability of two samples of the monitored signals being observed is
calculated under the assumption that the two samples have the same, predetermined
probability distribution. For example, a probability of observing a first ten minute sample
of a first signal, such as an ECG I lead signal, and a second ten minute sample of a second
signal, such as an ECG II lead signal, is calculated assuming the first and second samples
follow a predetermined population distribution (such as a normal distribution or the same
distribution as the database of monitored signals). This can be accomplished using the
Kolmogorov-Smirnov test, which tests the null hypothesis that the population distribution
from which the paired data sample is drawn conforms to a hypothesized distribution. If the
null hypothesis is true as determined by the chosen hypothesis test, a probability is
generated for observing the data samples, along with an associated confidence value or
interval. If the null hypothesis is not likely to be true based on the hypothesis test, then the
sample may be tainted with an artifact.
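
As an illustration of the Kolmogorov-Smirnov test named above, the following sketch compares a new sample against stored reference data. It assumes the two-sample variant of the test with the standard asymptotic p-value approximation; the function name `ks_two_sample`, the cutoff used for very small statistics, and the data values are choices made for this example, not part of the described method.

```python
import math

def ks_two_sample(sample, reference):
    """Two-sample Kolmogorov-Smirnov test.

    Returns (D, p), where D is the largest gap between the two empirical
    distribution functions and p is an asymptotic approximation of the p-value.
    """
    a, b = sorted(sample), sorted(reference)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    # Walk both sorted samples, tracking the gap between the empirical CDFs.
    while i < n and j < m:
        x = min(a[i], b[j])
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    # Asymptotic Kolmogorov distribution with a small-sample correction.
    n_eff = n * m / (n + m)
    lam = (math.sqrt(n_eff) + 0.12 + 0.11 / math.sqrt(n_eff)) * d
    if lam < 0.3:
        return d, 1.0  # the series converges slowly here and p is essentially 1
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                  for k in range(1, 101))
    return d, min(max(p, 0.0), 1.0)

# Hypothetical stored reference data and two new samples.
reference = [k / 10 for k in range(50)]
clean = list(reference)                    # same distribution as the reference
shifted = [x + 10.0 for x in reference]    # e.g. a large offset from a loose lead

d_clean, p_clean = ks_two_sample(clean, reference)
d_shift, p_shift = ks_two_sample(shifted, reference)
```

A small p-value, as for the shifted sample, suggests the sample does not follow the stored distribution and may be tainted with an artifact; a large p-value gives no grounds to reject the null hypothesis.
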

In step 25, the calculated probabilities are then weighted using the range of
probabilities for the given medical condition. The end result is the probability of the
sample not having an artifact. The probability of the sample having an artifact is simply
one minus this probability.

In step 26, the weighted probabilities are summed over all highly correlated signals
for a given sample.

In step 27, a result is output for each sample as a probability of not including an
artifact in each sample. If one or more of the probabilities of not including an artifact lies
below a predetermined threshold, a user is informed that one or more samples associated
with the probability may include an artifact.

In step 28, a determination is made as to whether more data exists, in which case the above
process is repeated continuously as long as new samples exist.
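
The output step can be sketched as a simple threshold check over the per-sample results. The function name, the threshold of 0.5 and the signal names and values are hypothetical; the text leaves the predetermined threshold unspecified.

```python
def flag_artifacts(p_no_artifact, threshold=0.5):
    """Return the names of samples whose probability of not including
    an artifact lies below the threshold."""
    return [name for name, p in p_no_artifact.items() if p < threshold]

# Hypothetical per-sample results from the summing step.
results = {"ECG I": 0.92, "ECG II": 0.31, "ABP": 0.78}

flagged = flag_artifacts(results)
for name in flagged:
    print(f"Sample from {name} may include an artifact")
```

In a continuous monitoring loop, this check would be repeated for each new batch of samples for as long as data keeps arriving.
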

Turning to FIG 3, shown therein is another exemplary embodiment of a method for 
monitoring a patient according to another aspect of the present invention. 

In step 31, several monitored signals are received from the patient, each of which
provides information as to the health of the patient. These signals can be EEG signals,
ECG signals, ABP signals, respiration signals, brainwaves, etc.

In step 32, hypothesis testing is employed against each of several monitored signals
to determine whether an artifact is present in the monitored signals. In the hypothesis
testing, a null hypothesis includes an assumption that pairs of samples of highly correlated
monitored signals of the several monitored signals have a predetermined distribution. The
predetermined distribution may include the same distribution as corresponding pairs of
stored versions of the monitored signals or some standard probability distribution, such as a
Gaussian distribution. The hypothesis testing will generate a probability that each of the
monitored signals includes an artifact.

In step 33, it is determined that an artifact may exist in one of the several monitored
signals when a likelihood that the null hypothesis is true falls below a predetermined
confidence level.

In step 34, an output signal is generated to alert an operator that at least one of the
monitored signals includes an artifact when the probability generated in step 32 exceeds a
predetermined threshold.

Turning to FIG 4, shown therein is an apparatus for processing data being received
from a patient being monitored for a medical condition. The apparatus can be part of a
system of intelligent modules, each of which processes information from the patient so that
a clinician or physician can be instantly notified or alerted as to a change in a clinical
condition of the patient being monitored.

A processor 41 performs the above mentioned methods to identify the presence of
artifacts in the data from the patient. Other modules may identify a clinically significant
change, and the identification of an artifact being present in the data can be used to
separate changes caused by artifacts from changes caused by changes in the condition of the
patient. Any processor capable of performing matrix manipulations of thousands of
samples should be sufficient to carry out the methods set forth herein. One possible
processor is the Intel Pentium processor.

One or more leads 45 transmit the samples to the CPU 41. These leads can be
standard ECG leads or a communication system that forwards data from a patient to a
central processing section for further processing. These leads could be wireless or wired.
In the case of wireless, the leads could be a single or multiple antennae. In the case of
wired leads, the leads could be a single lead that carries multiple signals or a single lead for
each sample.

A memory 43 stores any information necessary for the processor 41. For example,
memory 43 can be a computer readable medium that has encoded thereon instructions for
programming the processor 41. Memory 43 can also be a database that stores all incoming
samples for subsequent processing by the processor 41 so that samples can be accumulated
while the processor 41 is evaluating prior samples. The memory 43 stores these samples so
that the processor can reuse them for fine-tuning its analysis by, e.g., using more and more
data during each iteration to better evaluate the incoming samples. A memory of 50
gigabytes should be sufficient for this purpose. Memory 43 can be random access memory
or other memory to which data can be written as well as read from.

The CPU 41 is shown coupled to a database 42 that stores historical versions of the
monitored signals of interest. This coupling can be in the form of an actual
communications connection so that in real time the CPU 41 can obtain the desired
parameters discussed above. Alternatively, this coupling can be figurative in that the
desired parameters are obtained from the database 42 and then programmed in the
processor or stored in the memory 43. An example of this database is being developed by
NIH, as mentioned above.

A user interface 44 is coupled to the processor 41 so that an operator can be
informed if an artifact is present in the samples being received from the patient. The
operator can be shown the calculated probabilities, as well as the associated confidence
levels. Moreover, an alert can be generated when an artifact is detected, which alert can be
in the form of an audio or visual indicator.

It will be appreciated that modifications and variations of the invention are covered
by the above teachings and are within the purview of the appended claims without
departing from the spirit and intended scope of the invention. While the above
embodiments discuss a certain weighting technique for weighting the various probabilities
prior to summing them, other weighting techniques could be employed as well. Moreover,
while the above embodiments describe certain hypothesis tests, other hypothesis tests could
be used. Furthermore, these examples should not be interpreted to limit the modifications
and variations of the invention covered by the claims but are merely illustrative of possible
variations.